Audio thumbnails for spoken content without transcription based on a maximum motif coverage criterion

Authors

  • Guillaume Gravier
  • Nathan Souviraà-Labastie
  • Sebastien Campion
  • Frédéric Bimbot
Abstract

The paper presents a system to create audio thumbnails of spoken content, i.e., short audio summaries representative of the entire content, without resorting to a lexical representation. As an alternative to searching for relevant words and phrases in a transcript, unsupervised motif discovery is used to find short, word-like, repeating fragments at the signal level without acoustic models. The output of the word discovery algorithm is exploited via a maximum motif coverage criterion to generate a thumbnail in an extractive manner. A limited number of relevant segments are chosen within the data so as to include the maximum number of motifs while keeping the thumbnail short and intelligible. Evaluation is performed on broadcast news reports, with a panel of human listeners judging the quality of the thumbnails. Results indicate that motif-based thumbnails rank between random thumbnails and ASR-based keywords, but remain far behind human-authored thumbnails and keywords.
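The abstract does not spell out how the maximum motif coverage criterion is optimized. As an illustration only, the sketch below uses a standard greedy heuristic for budgeted maximum coverage: at each step it picks the candidate segment that adds the most motifs not yet covered, as long as it fits within a total duration budget. The segment identifiers, durations, motif ids, and the function name `greedy_motif_coverage` are all hypothetical, and the greedy strategy is an assumption, not necessarily the authors' method.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input format: segment id -> (duration in seconds, ids of motifs it contains).
Segment = Tuple[float, Set[int]]


def greedy_motif_coverage(segments: Dict[str, Segment], max_duration: float) -> List[str]:
    """Greedily select segments that cover many distinct motifs within a duration budget."""
    chosen: List[str] = []
    covered: Set[int] = set()
    remaining = max_duration
    while True:
        best_id, best_gain = None, 0
        for seg_id in sorted(segments):  # sorted for deterministic tie-breaking
            duration, motifs = segments[seg_id]
            if seg_id in chosen or duration > remaining:
                continue
            gain = len(motifs - covered)  # motifs this segment would newly cover
            if gain > best_gain:
                best_id, best_gain = seg_id, gain
        if best_id is None:  # nothing fits the budget or adds a new motif
            break
        duration, motifs = segments[best_id]
        chosen.append(best_id)
        covered |= motifs
        remaining -= duration
    return chosen


# Toy example with made-up segments and motif ids.
candidates = {
    "a": (5.0, {1, 2, 3}),
    "b": (4.0, {3, 4}),
    "c": (6.0, {5}),
    "d": (3.0, {1, 2}),
}
print(greedy_motif_coverage(candidates, max_duration=10.0))  # → ['a', 'b']
```

In this toy run, segment "a" is chosen first because it covers three motifs, then "b" adds motif 4. Neither "c" nor "d" fits in the remaining one second of budget, so the thumbnail stops at two segments.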

Similar resources

ThumbnailDJ: Visual Thumbnails of Music Content

Musical perception is non-visual, and people cannot describe what a song sounds like without listening to it. To facilitate music browsing and searching, we explore the automatic generation of visual thumbnails for music. Targeting an expert user group, DJs, we developed a concept named ThumbnailDJ: Based on a metaphor of music notation, a visual thumbnail can be automatically generated for an ...

Summarizing Audiovisual Contents of a Video Program

In this paper, we focus on video programs that are intended to disseminate information and knowledge, such as news, documentaries, seminars, etc., and present an audiovisual summarization system that summarizes the audio and visual contents of the given video separately, and then integrates the two summaries with a partial alignment. The audio summary is created by selecting spoken sentences tha...

Music Thumbnailer: Visualizing Musical Pieces in Thumbnail Images Based on Acoustic Features

This paper presents a principled method called MusicThumbnailer to transform musical pieces into visual thumbnail images based on acoustic features extracted from their audio signals. These thumbnails can help users immediately guess the musical contents of audio signals without trial listening. This method is consistent in ways that optimize thumbnails according to the characteristics of a targ...

Advances in Profile Assisted Voicemail Management

Spoken audio is an important source of information available to knowledge extraction and management systems. Organization of spoken messages by priority and content can facilitate knowledge capture and decision making based on profiles of recipients as these can be determined by physical and social conditions. This paper revisits the above task and addresses a related data sparseness problem. W...

A Discriminative Approach for Unsupervised Clustering of DNA Sequence Motifs

Algorithmic comparison of DNA sequence motifs is a problem in bioinformatics that has received increased attention in recent years. Its main applications concern characterization of potentially novel motifs and clustering of a motif collection in order to remove redundancy. Despite growing interest in motif clustering, the question of which motif clusters to aim at has so far not been system...

Publication date: 2014